[Mali\*et al., 5(8): August, 2016] ISSN: 2277-9655 ICTM Value: 3.00 Impact Factor: 4.116



# INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY

# PERFORMANCE OPTIMIZATION OF LUT IN FPGA USING CNFET

Manasi R. Mali\*, Dr. S.D Pable, Prof R.S Khule

Department of Electronics and Telecommunication, Matoshri College of Engineering & Research Centre, Savitribai University of Pune, India.

**DOI**: 10.5281/zenodo.60127

# **ABSTRACT**

Leakage power dissipation is becoming a concern in field-programmable gate arrays (FPGAs) as FPGA technology scales. FPGAs are the implementation platform of choice when design flexibility is required. However, their high power consumption, which arises from their flexible structure, makes them less appealing for extreme low-power applications; hence it is important to investigate ways of reducing FPGA power consumption. This paper proposes an energy-efficient dual-threshold Carbon Nanotube Field Effect Transistor (CNFET) based architecture for a 4-input Look-up Table (LUT), a building block of FPGAs. HSPICE simulations based on the Berkeley Predictive Technology Model (BPTM) for 32nm channel length show that the CNFET based LUT improves delay by 95% and is 98% more leakage power efficient than the LUT implemented in bulk CMOS.

KEYWORDS: FPGA, CNFET, LUT

## **INTRODUCTION**

Leakage power dissipation has been an area of concern in CMOS technology for some time now. It occurs as a result of undesirable currents flowing through CMOS transistors when they are scaled down. With scaling, it is now possible to build FPGAs with a high density of devices. As many of these devices sit idle for long periods of time, they contribute to increased static power dissipation, which is mainly caused by leakage currents. Leakage power dissipation has grown to be a significant fraction of overall chip power dissipation in modern processes, and it is expected to grow significantly in future processes [1], [2]. Due to the increasing complexity of modern digital designs, and because of their low NRE (non-recurring engineering) cost and short time to market, FPGAs have become an attractive implementation option [3]. To achieve programmability, FPGAs use many more transistors per function than application-specific integrated circuits, resulting in higher leakage power consumption [4]. Most of the early work on low-power FPGAs focused on dynamic power consumption, but leakage power has now become almost 50% of total FPGA power [5].

FPGAs consist of an array of logic blocks that are connected through routing switches. The logic blocks are composed of LUTs and flip-flops [6], [7]; the FPGA block diagram is shown in Figure 1. Most of the research work to date has concentrated on reducing leakage within the routing switches, which accounts for more than 60% of total FPGA leakage [8]. The leakage power of the LUT, which currently comprises ~25% of total chip power, has become equally important with the advent of new commercial FPGAs such as Altera's Stratix-III and Xilinx's Virtex-5, which use larger LUTs [9], [10].






Figure 1: Block Diagram of FPGA

This paper explores the subthreshold performance of novel devices for FPGA LUTs at the deep submicron 32nm technology node for ultra-low-power applications. We implement a 4-input LUT in carbon nanotube field effect transistor (CNFET) technology. Owing to its superior conductance, very high $I_{ON}/I_{OFF}$ ratio, high drive current and high thermal stability, the CNFET is expected to be the best choice for minimizing leakage power in future FPGAs [9].

#### **LOOK-UP TABLE**

The basic 3-input LUT structure is shown in Figure 2. This is similar to the traditional LUT design [9]. The output of the LUT is connected to one of 8 SRAM cells through a path that is controlled by pass gates. The 8 SRAM cells store the logic function of the LUT. Each input combination turns on exactly one path between an SRAM cell and the output. The number of stages in the LUT is equal to the number of inputs of the LUT; the stages are labeled in Figure 2. In general, a larger number of LUT inputs is desirable, as it allows the LUT to be configured with a more complex logic function. However, the number of stages grows with the number of inputs, which increases the number of pass gates the signals must propagate through before they appear at the output. In our design, we chose an NMOS pass gate-based LUT instead of a CMOS pass gate-based LUT, because the extra PMOS device in the CMOS pass gate increases area and capacitance, thus degrading performance and area efficiency, as was shown in [9]. However, in super-threshold operation, the use of an NMOS pass gate-based LUT results in a drop in the output voltage of the first stage by the threshold voltage $V_T$. This can be corrected by a restoring buffer P at the output, yielding a full voltage swing. The output of the inverter P is connected to both a flip-flop and a MUX. The MUX chooses between the non-registered and the registered output of the LUT. The output of the MUX is then regenerated by a low-switching-point inverter Q connected to a PMOS keeper, as shown on the right of Figure 2.
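The pass-gate tree described above can be illustrated with a small behavioral model. This is only a sketch of the selection logic, not of the circuit: the function names `lut_read` and `program_lut`, and the choice that the first input drives the first stage, are assumptions made for illustration and do not come from the paper.

```python
from itertools import product


def lut_read(sram_bits, inputs):
    """Walk the pass-gate tree: one 2:1 selection stage per LUT input."""
    if len(sram_bits) != 2 ** len(inputs):
        raise ValueError("a k-input LUT needs 2**k SRAM cells")
    level = list(sram_bits)
    # Each stage is controlled by one input and halves the number of
    # candidate paths, so k inputs leave exactly one SRAM cell connected.
    for bit in inputs:
        level = [level[2 * j + bit] for j in range(len(level) // 2)]
    return level[0]


def program_lut(func, k):
    """Fill the SRAM cells with the truth table of func (hypothetical helper)."""
    cells = []
    for idx in range(2 ** k):
        # Assumed ordering: input 0 is the least significant bit of the index.
        bits = [(idx >> i) & 1 for i in range(k)]
        cells.append(func(*bits) & 1)
    return cells


def majority(a, b, c):
    """Example 3-input function to store in the LUT."""
    return (a & b) | (b & c) | (a & c)


sram = program_lut(majority, 3)
for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, "->", lut_read(sram, [a, b, c]))
```

A 3-input LUT passes through three selection stages and a 4-input LUT through four, which is the growth in pass-gate depth that the text identifies as the cost of larger LUTs.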

The whole LUT circuitry is implemented in regular threshold voltage ($R_{VT}$). Fig. 3 shows the circuit of the level restoring buffer, in which the transistors MN1 and MP2, which are critical for the rising signal transition, are implemented in regular threshold voltage ($R_{VT}$), while MP1 and MN2 are implemented in high threshold voltage ($H_{VT}$). With this balanced rising and falling technique, a delay penalty of only 11% is incurred compared to the baseline level restoring buffer, while substantial reductions of 69% and 64% in active and standby leakage, respectively, are achieved [2]. All the mux tree transistors have regular threshold voltage, because using $H_{VT}$ for these transistors would incur a high delay penalty.






Figure 2: 3 input LUT



Figure 3: Level Restoring Buffer

CMOS semiconductor devices are scaled aggressively to achieve high integration density, but leakage power dissipation and variability are drawbacks of scaling. To overcome these drawbacks, we use CNFET technology for the LUT in the next section.

# PERFORMANCE ANALYSIS USING CARBON NANOTUBE FIELD EFFECT TRANSISTORS (CNFETS)

## Carbon Nanotube Field Effect Transistor

The research community is actively investigating CNFETs as a promising device for integrated circuit technology at the end of, or beyond, the ITRS roadmap. Among emerging technologies, the CNFET is one of the most promising devices, since most of the fundamental limitations of traditional MOSFETs are reduced in CNFETs. A single-wall carbon nanotube (SWCNT) is a one-dimensional conductor obtained by rolling a sheet of graphene into a tube. Fig. 4 shows a CNFET structure, which is similar to that of a MOSFET: a bulk dielectric material sits on top of a silicon substrate, and SWCNTs are used as the channel. Depending on its structure, an SWCNT is either metallic or semiconducting. This property is described by the chiral vector (n, m), where n and m are the integer coefficients of the graphene lattice basis vectors. When the difference between n and m is 0 or a multiple of 3, the SWCNT is metallic; otherwise it is semiconducting.




Figure 4: 3D CNFET Structure

Furthermore, the chiral vector determines the diameter of the SWCNT. The diameter of the CNT is given by equation (1),

$$D_{CNT} = \frac{a\sqrt{m^2 + mn + n^2}}{\pi} \tag{1}$$

where m and n are the chirality numbers of the CNT and $a$ is the lattice constant ($a = 2.49 \times 10^{-10}$ m). The threshold voltage of the CNFET is given by equation (2) [9],

$$V_{TH} = \frac{\sqrt{3}}{3} \frac{aV_\pi}{D_{CNT}} \tag{2}$$

where $V_\pi = 3.03$ eV is the carbon π-π bond energy. With $V_\pi$ expressed in electron-volts, equation (2) gives $V_{TH}$ directly in volts.
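As a worked illustration of equations (1) and (2), the short script below evaluates the diameter and threshold voltage for the zigzag chiralities (19,0) and (13,0) that the paper uses in its simulations. It is a sketch under stated assumptions: the function names are hypothetical, $V_\pi$ is taken as 3.03 eV so that the result is read in volts, and no value here is taken from the HSPICE model itself.

```python
import math

A_LATTICE = 2.49e-10  # lattice constant a in metres, as quoted in the text
V_PI = 3.03           # carbon pi-pi bond energy, assumed to be in eV


def is_metallic(n, m):
    """Chirality rule from the text: metallic when (n - m) is 0 or a multiple of 3."""
    return (n - m) % 3 == 0


def cnt_diameter(n, m):
    """Equation (1): CNT diameter in metres."""
    return A_LATTICE * math.sqrt(m * m + m * n + n * n) / math.pi


def threshold_voltage(n, m):
    """Equation (2): threshold voltage in volts (V_PI expressed in eV)."""
    return (math.sqrt(3) / 3) * A_LATTICE * V_PI / cnt_diameter(n, m)


for chirality in [(19, 0), (13, 0)]:
    d_nm = cnt_diameter(*chirality) * 1e9
    vth = threshold_voltage(*chirality)
    kind = "metallic" if is_metallic(*chirality) else "semiconducting"
    print(f"{chirality}: {kind}, D = {d_nm:.3f} nm, V_TH = {vth:.3f} V")
```

Under these formulas both chiralities are semiconducting, and the narrower (13,0) tube has a threshold roughly 0.1 V higher than the (19,0) tube (about 0.42 V versus 0.29 V), which is the mechanism the paper relies on to obtain its high-threshold devices.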

#### Dual Threshold CNFET based LUT

We implement a dual-threshold 4-input LUT using CNFETs, with the same structure as the MOSFET-based LUT shown in Figure 2. The bandgap $E_g$ of a CNFET depends strongly on the nanotube diameter $d$, scaling as $E_g \propto 1/d$. For conduction to start, the barrier at the source-channel junction, of height $E_g/2$, has to be overcome. Because this barrier height determines the threshold potential of the CNFET, $V_{TH} \propto 1/d$, and the threshold voltage ($V_T$) can be varied by changing the diameter of the CNT. This approach is used to raise $V_T$ by 0.1 V to obtain the high threshold voltage ($H_{VT}$) transistors referred to throughout this text, which is achieved by selecting chirality (13,0) for simulation. Hence we use chirality (13,0) for the level restoring buffer and chirality (19,0) for the LUT circuitry. Table 1 compares the performance of the dual $V_T$ and single $V_T$ CNFET LUTs: the dual $V_T$ CNFET LUT gives a 62% improvement in power and a 65% improvement in delay.

Table 1. Comparison of single $V_T$ and dual $V_T$ CNFET LUT

| Sr no | Parameter     | Single V<sub>T</sub> CNFET LUT | Dual V<sub>T</sub> CNFET LUT |
|-------|---------------|--------------------------------|------------------------------|
| 1.    | Average power | 1.90 nW                        | 0.72 nW                      |
| 2.    | Delay         | 0.44 ps                        | 0.15 ps                      |
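The improvement percentages quoted for the dual $V_T$ design can be checked directly from the Table 1 entries, taking the single $V_T$ LUT as the baseline (the usual convention, assumed here):

$$\frac{1.90 - 0.72}{1.90} \approx 0.62, \qquad \frac{0.44 - 0.15}{0.44} \approx 0.66$$

These correspond to the roughly 62% power and 65% delay improvements stated in the text.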

To evaluate the performance benefit of CNFETs for FPGAs, the dual threshold LUT is implemented with both bulk CMOS and CNFET transistors at the 32nm technology node. The selected (W/L) ratios for the NMOS and PMOS transistors are 2 and 4, respectively. For comparison, CNFETs of chirality (19, 0) with 3 nanotubes each are used for both the N-type and P-type devices. The level restoring buffer of each LUT is likewise sized up in terms of width and number of CNTs, respectively. Simulation is carried out at a frequency of 770MHz with the supply voltage ranging from 0.2 V to 0.4 V. The delay and power observed for both LUTs are shown in Tables 2 and 3.

Table 2. Delay Comparison of CMOS and CNFET LUT


| Vdd (V) | CMOS LUT delay (ns) | CNFET LUT delay (ns) |
|------|-----------------------|------------------------|
| 0.2  | 201                   | 2.02                   |
| 0.25 | 92                    | 0.9                    |
| 0.3  | 51.1                  | 0.54                   |
| 0.35 | 30.5                  | 0.19                   |
| 0.4  | 20                    | 0.086                  |

Table 3. Comparison of power consumed by CMOS and CNFET LUT

| Vdd (V) | CMOS LUT power (nW) | CNFET LUT power (nW) |
|------|-----------------------|---------------------|
| 0.2  | 10.4                  | 0.5                 |
| 0.25 | 17                    | 0.75                |
| 0.3  | 27.7                  | 0.69                |
| 0.35 | 43.5                  | 1.04                |
| 0.4  | 68.6                  | 1.34                |




Figure 5: PDP as function of supply voltage

Due to the higher mobility and ballistic transport of the CNFET, the LUT implemented with CNFETs is faster, as shown in Table 2. Similarly, the effective width of the CNFET is very small, so the switching power consumption of the CNFET LUT is lower than that of the bulk LUT, as shown in Table 3. Due to its lower delay and power consumption, the power-delay product (PDP) of the CNFET LUT is lower than that of the LUT implemented in bulk. Because the parasitic load of the LUT circuit is high, the PDP decreases as $V_{DD}$ is raised over the range from 0.2 V to 0.4 V; at the lower end of the range (i.e. at 0.2 V) the delay increases abruptly and the PDP increases as well. This is due to the long critical path of the 4-input LUT (i.e. the 16:1 multiplexer that is a sub-block of the LUT). The optimum PDP for both LUTs is obtained at $V_{DD}$ = 0.4 V. Owing to the lower power consumption and higher speed of the CNFET, the CNFET based LUT provides a 97% improvement in optimum PDP compared to the CMOS LUT, as depicted in Figure 5.
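The PDP trend can be cross-checked by multiplying the tabulated delay and power at each supply voltage. The script below is a minimal sketch that uses only the five operating points listed in the delay and power tables; the helper names are hypothetical, and because the tables sample only a few voltages, the derived numbers need not coincide exactly with the percentages quoted in the text or with the curve in Figure 5.

```python
# Values transcribed from the delay and power comparison tables.
VDD = [0.2, 0.25, 0.3, 0.35, 0.4]                 # supply voltage, V
CMOS_DELAY_NS = [201, 92, 51.1, 30.5, 20]
CNFET_DELAY_NS = [2.02, 0.9, 0.54, 0.19, 0.086]
CMOS_POWER_NW = [10.4, 17, 27.7, 43.5, 68.6]
CNFET_POWER_NW = [0.5, 0.75, 0.69, 1.04, 1.34]


def pdp_aj(delay_ns, power_nw):
    """Power-delay product per point; 1 ns * 1 nW = 1e-18 J = 1 aJ."""
    return [d * p for d, p in zip(delay_ns, power_nw)]


def improvement_pct(baseline, improved):
    """Percentage reduction of 'improved' relative to 'baseline'."""
    return [100.0 * (b - i) / b for b, i in zip(baseline, improved)]


cmos_pdp = pdp_aj(CMOS_DELAY_NS, CMOS_POWER_NW)
cnfet_pdp = pdp_aj(CNFET_DELAY_NS, CNFET_POWER_NW)
delay_gain = improvement_pct(CMOS_DELAY_NS, CNFET_DELAY_NS)
power_gain = improvement_pct(CMOS_POWER_NW, CNFET_POWER_NW)

for v, pc, pn, dg, pg in zip(VDD, cmos_pdp, cnfet_pdp, delay_gain, power_gain):
    print(f"Vdd={v:.2f} V  CMOS PDP={pc:8.1f} aJ  CNFET PDP={pn:6.3f} aJ  "
          f"delay gain={dg:5.1f}%  power gain={pg:5.1f}%")
```

Expressing the product in attojoules keeps the CMOS and CNFET values on a readable scale without unit conversion.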

# **CONCLUSION**

In this paper, 32nm bulk and CNFET based technologies are used to evaluate the leakage performance of a 4-input Look-up Table. Based on HSPICE simulation, we find that the dual threshold CNFET based LUT improves power by 62% and delay by 65% compared to the single threshold CNFET based LUT. The CNFET based LUT is 98% more leakage power efficient than the LUT implemented in bulk technology, and its delay is improved by 95%. Owing to the lower power consumption and higher speed of the CNFET, the CNFET based LUT provides a 97% improvement in optimum PDP compared to the CMOS LUT, as depicted in Figure 5. This shows that the CNFET holds a lot of promise as an alternative to the MOS transistor for implementing the logic blocks of future low-power FPGAs.

# REFERENCES

- [1] Navid Azizi and Farid N. Najm, "Look-Up Table Leakage Reduction for FPGAs," IEEE Custom Integrated Circuits Conference (CICC), September 2005.
- [2] E. Kusse and J. Rabaey, "Low-energy embedded FPGA structures," in Proc. Int. Symp. Low Power Electronics and Design, pp. 155-160, Aug. 1998.
- [3] J. H. Anderson and F. N. Najm, "Low-Power Programmable Routing Circuitry for FPGAs," ICCAD, pp. 602-609, 2004.
- [4] Pradeep S. Nair, Santosh Koppa, and Eugene John, "A comparative analysis of coarse-grain and fine-grain power gating for FPGA lookup tables," 52<sup>nd</sup> IEEE International Midwest Symposium on Circuits and Systems, pp. 507-510, 2009.
- [5] Arifur Rahman, Satyaki Das, Tim Tuan, and Steve Trimberger, "Determination of Power Gating Granularity for FPGA Fabric," IEEE Custom Integrated Circuits Conference, pp. 9-12, 2006.




- [6] A. Lodi, L. Ciccarelli, D. Loparco, R. Canegallo, and R. Guerrieri, "Low leakage design of LUT-based FPGAs," in Proceedings of the 31st European Solid-State Circuits Conference (ESSCIRC), pp. 153-156, Sept. 2005.
- [7] S. Pable, A. Imran, M. Hasan, and A. Islam, "Performance optimization of LUT of subthreshold FPGA in deep submicron," in IEEE International Conference on Computer and Communication Technology (ICCCT), Allahabad, India, pp. 64-69, Sept. 2010.
- [8] Kureshi Abdul Kadir and Mohd. Hasan, "Low Leakage High Speed Carbon-Nano Tube Field Effect Transistor Based Look Up Table," in First International Conference on Emerging Trends in Engineering and Technology, IEEE, 2008.
- [9] S. D. Pable, A. K. Kureshi, and Mohd. Hasan, "Robustness comparison of emerging devices for portable applications," Journal of Nanomaterials, Hindawi Publishing Corporation, Volume 2012, Article ID 242459, 8 pages.
- [10] P. A. Gowri Sankar and K. Udhayakumar, "MOSFET-like CNFET based logic gate library for low-power application: a comparative study," Journal of Semiconductors, Vol. 35, No. 7, July 2014.
- [11] A. Raychowdhury, S. Mukhopadhyay, and K. Roy, "A Circuit-Compatible Model of Ballistic Carbon Nanotube Field-Effect Transistors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 23, pp. 1411-1420, 2004.
- [12] Jie Deng, "Device modeling and circuit performance evaluation for nanoscale devices: Silicon technology beyond 45nm node and carbon nanotube field effect transistors," Ph.D. thesis, Stanford University, 2007.
- [13] T. Tuan and B. Lai, "Leakage power analysis of a 90nm FPGA," in Proc. IEEE Custom Integrated Circuits Conf., pp. 57-60, 2003.
- [14] S. Iijima, "Carbon Nanotubes: past, present, and future," Physica B, 323, pp. 1-5, 2002.
- [15] A. Raychowdhury, A. Keshavarzi, J. Kurtin, V. De, and K. Roy, "Carbon nanotube field-effect transistors for high-performance digital circuits—DC analysis and modeling toward optimum transistor structure," IEEE Trans. Electron Devices, Vol. 53, No. 11, pp. 2711-2717, 2006.